Automatically Identifying Changes in the Semantic Orientation of Words
Abstract
The meanings of words are not fixed but in fact undergo change, with new word senses arising and established senses taking on new aspects of meaning or falling out of usage. Two types of semantic change are amelioration and pejoration; in these processes a word sense changes to become more positive or negative, respectively. In this first computational study of amelioration and pejoration we adapt a web-based method for determining semantic orientation to the task of identifying ameliorations and pejorations in corpora from differing time periods. We evaluate our proposed method on a small dataset of known historical ameliorations and pejorations, and find it to perform better than a random baseline. Since this test dataset is small, we conduct a further evaluation on artificial examples of amelioration and pejoration, and again find evidence that our proposed method is able to identify changes in semantic orientation. Finally, we conduct a preliminary evaluation in which we apply our methods to the task of finding words which have recently undergone amelioration or pejoration.

1. Detecting changes in semantic orientation

Word senses are continually evolving, with both new words and new senses of words arising almost daily. Systems for natural language processing tasks, such as question answering and automatic machine translation, often depend on lexicons for a variety of information, such as a word’s parts-of-speech or meaning representation. When a sense of a word that is not recorded in a system’s lexicon is encountered in a text being processed, the system will typically fail to recognize the novel word sense as such, and then incorrectly draw on information from the lexical entry corresponding to some other sense of that word. The performance of the entire system will then likely suffer due to this incorrect lexical information.
Ideally, a system could automatically identify novel word senses, and subsequently infer the necessary lexical information for the computational task at hand (e.g., the correct meaning representation for a novel word sense). Indeed, novel word senses present one of the most challenging phenomena in lexical acquisition (Zernik, 1991).

New word senses also present challenges in lexicography where determining how established words and senses have changed is recognized as an important, and very difficult, problem (Simpson, 2007). Dictionaries covering current language must be updated to reflect new senses of words (and indeed new words themselves) that have come into usage, and also changes in the usage of established words and senses. Nowadays, vast quantities of text are produced each day in a variety of media including traditional publications such as newspapers and magazines, as well as newer types of communication such as blogs and micro-blogs (e.g., Twitter, http://twitter.com). Lexicographers must search this text for new word senses; however, given the amount of text that must be analyzed, it is simply not feasible to manually process it all (Barnhart, 1985). Therefore, automatic (or semi-automatic) methods for identifying changes in a word’s senses (such as new word senses) could be very helpful.

Early approaches to detecting novel word senses that rely on rich lexical representations (e.g., Wilks, 1978) are not feasible in today’s context since such resources are not available for large-scale vocabularies. However, words often change meaning in regular ways (Campbell, 2004), and this insight can be leveraged in computational systems. For example, Sagi et al. (2009) exploit knowledge of semantic widening and narrowing (extension and restriction of meaning) to automatically identify words which have undergone these changes.
Although preliminary, their results suggest that focusing on specific types of semantic change is a promising direction for detecting new word senses.

One aspect of word-level semantics of great interest today is semantic orientation. Much recent computational work has looked at determining the sentiment or opinion expressed in some text (see Pang and Lee (2008) for an overview). A key aspect of many sentiment analysis systems is a lexicon in which words or senses are annotated with semantic orientation. Such lexicons are often manually crafted (e.g., the General Inquirer, Stone et al., 1966). However, it is clearly important to have automatic methods to detect semantic changes that affect a word’s orientation in order to keep such lexicons up-to-date. Indeed, there have been recent efforts to automatically infer polarity lexicons from corpora (e.g., Hatzivassiloglou and McKeown, 1997; Turney and Littman, 2003) and from other lexicons (e.g., Esuli and Sebastiani, 2006; Mohammad et al., 2009), and to adapt existing polarity lexicons to specific domains (e.g., Choi and Cardie, 2009). Similarly, since appropriate usage of words depends on knowledge of their semantic orientation, tools for detecting such changes would be helpful for lexicographers in updating dictionaries. Furthermore, diachronic studies in corpus linguistics have, to the best of our knowledge, not considered changes in the polarity of words, and have instead focused on topics such as changes in word frequency (e.g., Hilpert and Gries, 2009). We focus here on amelioration and pejoration, common linguistic processes through which the meaning of a word changes to have a more positive or negative orientation.
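To make the idea of a change in semantic orientation concrete, the following is a minimal sketch and should not be read as the implementation evaluated in this paper. It assumes a simplified, corpus-based variant of the seed-word approach of Turney and Littman (2003): a word's polarity is estimated from how often it occurs near positive rather than negative seed words, and the estimates from an earlier and a later corpus are compared. The seed words, window size, smoothing constant, function names, and the two toy corpora are all illustrative assumptions.

```python
import math

# Illustrative seed words (an assumption; not a list given in the text above).
POSITIVE_SEEDS = {"good", "nice", "excellent"}
NEGATIVE_SEEDS = {"bad", "nasty", "poor"}


def polarity(target, sentences, window=5, smoothing=0.5):
    """Log-ratio polarity estimate for `target` in a tokenized corpus.

    Counts seed words within `window` tokens of each occurrence of the
    target, normalizes by how frequent the seeds are overall, and adds
    `smoothing` to every count so the ratio is always defined.
    """
    pos_near = neg_near = 0    # seed tokens found near the target
    pos_total = neg_total = 0  # seed tokens anywhere in the corpus
    for tokens in sentences:
        for i, token in enumerate(tokens):
            if token in POSITIVE_SEEDS:
                pos_total += 1
            elif token in NEGATIVE_SEEDS:
                neg_total += 1
            if token == target:
                context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
                pos_near += sum(t in POSITIVE_SEEDS for t in context)
                neg_near += sum(t in NEGATIVE_SEEDS for t in context)
    return math.log2(
        ((pos_near + smoothing) * (neg_total + smoothing))
        / ((neg_near + smoothing) * (pos_total + smoothing))
    )


def orientation_change(target, earlier, later):
    """Polarity in the later corpus minus polarity in the earlier corpus."""
    return polarity(target, later) - polarity(target, earlier)


def label_change(change, threshold=0.0):
    """Map a polarity difference to a coarse label (threshold is hypothetical)."""
    if change > threshold:
        return "amelioration"
    if change < -threshold:
        return "pejoration"
    return "no clear change"


# Toy, made-up corpora standing in for text from two time periods.
EARLIER = [
    ["a", "terrific", "and", "nasty", "storm"],
    ["the", "terrific", "noise", "was", "bad"],
    ["good", "food"],
]
LATER = [
    ["a", "terrific", "and", "excellent", "meal"],
    ["what", "a", "terrific", "good", "show"],
    ["bad", "luck"],
]

if __name__ == "__main__":
    change = orientation_change("terrific", EARLIER, LATER)
    print(label_change(change))
```

In this toy setting the target word co-occurs only with negative seeds in the earlier sentences and only with positive seeds in the later ones, so the difference is positive and the word is labeled a candidate amelioration. With real corpora the counts would be far noisier, and the choice of seeds, window, and threshold would need to be tuned and validated.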